
Feed model_id and variant_label to recipe #309

Merged
mo-nikosbaltas merged 33 commits into main from 287-feed-model_id-and-variant_label-to-recipe
Jan 8, 2026

@mo-nikosbaltas
Collaborator

@mo-nikosbaltas mo-nikosbaltas commented Dec 30, 2025

Closes #287

PR creation checklist for the developer

  • Has <issue_number> above ☝️ been replaced with the issue number?
  • Has main been selected as the base branch?
  • Does the feature branch name follow the format <issue_number>_<short_description_of_feature>?
  • Does the text of the PR title exactly match with the text (not including the issue number) of the issue title?
  • Have appropriate reviewers been added to the PR (once it is ready for review)?
  • Has the PR been assigned to the developer(s)?
  • Have the same labels as on the issue (except for the good first issue label) been added to the PR?
  • Has the Climate Model Evaluation Workflow (CMEW) project been added to the PR?
  • Has the appropriate milestone been added to the PR?

Definition of Done for the developer

PR creation checklist for the reviewer

  • Has <issue_number> above ☝️ been replaced with the issue number?
  • Has main been selected as the base branch?
  • Does the feature branch name follow the format <issue_number>_<short_description_of_feature>?
  • Does the text of the PR title exactly match with the text (not including the issue number) of the issue title?
  • Have appropriate reviewers been added to the PR (once it is ready for review)?
  • Has the PR been assigned to the developer(s)?
  • Have the same labels as on the issue (except for the good first issue label) been added to the PR?
  • Has the Climate Model Evaluation Workflow (CMEW) project been added to the PR?
  • Has the appropriate milestone been added to the PR?

Definition of Done for the reviewer

  • Does the change in this PR address the above issue / have all acceptance criteria been met?
  • Does the change in this PR follow the requirements in the wiki: Developer Guide (including copyrights)?
  • Have new tests related to the change been added?
  • Do all the GitHub workflow checks pass?
  • Do all the tests run locally and pass? (Note: the tests are not run by the GitHub workflow, see wiki: Run the tests locally)
  • Has the API documentation (e.g. docstrings in Python modules) related to the change been updated appropriately?
  • Has the user documentation (i.e. everything in the doc directory) related to the change been updated appropriately, including the Quick Start section?
  • Do the HTML pages render correctly? (See wiki: Build the documentation locally)

@mo-nikosbaltas mo-nikosbaltas self-assigned this Dec 30, 2025
@mo-nikosbaltas mo-nikosbaltas added enhancement New feature or request recipe Anything related to ESMValTool rose Anything related to Rose labels Dec 30, 2025
@mo-nikosbaltas
Collaborator Author

@alistairsellar could you please provide feedback on the errors encountered? These are not related to the implementation but to the availability of data (I think, but I am not an expert on this!).

When running cylc, run_recipe_radiation_budget failed. We get this error from ESMValTool.
Looking at job.out:

ERROR No input files found for Dataset: . dataset: 'HadGEM3-GC31-LL',
project: 'CMIP6', exp: 'historical', ensemble: 'r5i1p1f3' .
.
Looked for files matching
/data/users/managecmip/champ/CMIP6/CMIP/MOHC/HadGEM3-GC31-LL/historical/r5i1p1f3/Amon/hfls/gn//hfls_Amon_HadGEM3-GC31-LL_historical_r5i1p1f3_gn.nc
/data/users/managecmip/champ/CMIP6/CMIP/NERC/HadGEM3-GC31-LL/historical/r5i1p1f3/Amon/hfls/gn//hfls_Amon_HadGEM3-GC31-LL_historical_r5i1p1f3_gn.nc
Similar "No input files found" errors are printed for rlds, rls, rss for the reference dataset.

At the end of the file:
ERROR Could not create all tasks
ERROR Missing data for preprocessor seasonal_radiation_budget/hfls: .
dataset: HadGEM3-GC31-LL . ensemble: r5i1p1f3 .
ERROR Not all input files required to run the recipe could be found.
So ESMValTool is telling you:

It is looking in the CHAMP CMIP6 archive under /data/users/managecmip/champ/CMIP6/CMIP/...
It cannot find the HadGEM3-GC31-LL historical r5i1p1f3 files for several variables for 1993.
That is why run_recipe_radiation_budget fails.

Looking at the logs further, #287 does the right thing.

Reference dataset (dataset index 0) is now:
dataset: HadGEM3-GC31-LL
project: CMIP6
exp: historical
ensemble: r5i1p1f3
activity: CMIP

Evaluation dataset (dataset index 1) is:
dataset: UKESM1-0-LL
project: ESMVal
exp: amip
activity: ESMVal
ensemble: r1i1p1f1

And in the same log I could see:
ESMValTool does find EVAL data:

Found input files for Dataset: hfls, Amon, ESMVal, UKESM1-0-LL, ESMVal, amip, r1i1p1f1, gn, v20251230
Found input files for Dataset: hfss, Amon, ESMVal, UKESM1-0-LL, ESMVal, amip, r1i1p1f1, gn, v20251230
Found input files for Dataset: rlds, Amon, ESMVal, UKESM1-0-LL, ESMVal, amip, r1i1p1f1, gn, v20251230
etc.

So:
The EVAL path (the CDDS output under ${ROOT_DATA_DIR}) is correct and being picked up.
The REF dataset is correctly wired to CMIP6 HadGEM3-GC31-LL with the variant REF_VARIANT_LABEL we configured in rose-suite.conf.

The failure is specifically: the CHAMP CMIP6 archive does not contain all the expected files for HadGEM3-GC31-LL, historical, r5i1p1f3, year 1993 for all variables.

The current rose-suite.conf has:

MODEL_ID="UKESM1-0-LL"
VARIANT_LABEL="r1i1p1f1"

REF_MODEL_ID="HadGEM3-GC31-LL"
REF_VARIANT_LABEL="r5i1p1f3"

Alistair, can you check whether the files exist for HadGEM3-GC31-LL, r5i1p1f3? If those files (hfls, rlds, rls, rss for 1993) are missing or stored under a different ensemble, that would explain these errors.
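
One way to run that availability check is sketched below. This is only an illustration: the helper name missing_variables is hypothetical, and the filename pattern is inferred from the paths in the ESMValTool log above (with an assumed wildcard for whatever follows the variant label), not taken from ESMValTool itself.

```python
from pathlib import Path


def missing_variables(root, model_id, experiment, variant, variables, mip="Amon"):
    """Return the variables that have no NetCDF file anywhere under `root`.

    Hypothetical helper. The filename prefix mirrors the log output:
    <variable>_<mip>_<model>_<experiment>_<variant>_...
    """
    missing = []
    for variable in variables:
        pattern = f"{variable}_{mip}_{model_id}_{experiment}_{variant}_*.nc"
        # rglob searches all subdirectories, so the institute, grid and
        # version directories do not need to be known in advance.
        if not any(Path(root).rglob(pattern)):
            missing.append(variable)
    return missing
```

Calling it with the archive root, "HadGEM3-GC31-LL", "historical", "r5i1p1f3" and the list ["hfls", "rlds", "rls", "rss"] would list the variables for which nothing matches under that ensemble.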

To confirm that the #287 implementation is correct we can align REF_ with EVAL_ and check that it works.
In other words, in rose-suite.conf we set:
REF_MODEL_ID="UKESM1-0-LL"
REF_VARIANT_LABEL="r1i1p1f1"

But if we want to keep the REF settings as:
REF_MODEL_ID="HadGEM3-GC31-LL"
REF_VARIANT_LABEL="r5i1p1f3"
then you need to check the availability of the data. Otherwise we need to choose a different ensemble.

@alistairsellar
Collaborator

It is looking in the CHAMP CMIP6 archive under /data/users/managecmip/champ/CMIP6/CMIP/...

Hrmmm, it is indeed looking there. However it shouldn't be looking there, since that's our mirror of CMIP data and we now want CMEW to be feeding the locally standardised data into ESMValTool, not CMIP data. So ESMValTool should be looking in the cylc-run dir for the standardised data.

@mo-nikosbaltas mo-nikosbaltas changed the title feed model id and variant label to recipe Feed model_id and variant_label to recipe Dec 31, 2025
@mo-nikosbaltas mo-nikosbaltas marked this pull request as ready for review December 31, 2025 10:55
Collaborator

@alistairsellar alistairsellar left a comment

Thanks @mo-nikosbaltas, looks good. My requested changes relate to comments and docstrings only.

mo-nikosbaltas and others added 5 commits December 31, 2025 12:53
Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
…b.com:MetOffice/CMEW into 287-feed-model_id-and-variant_label-to-recipe
alistairsellar and others added 6 commits January 5, 2026 19:14
Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
Collaborator

@alistairsellar alistairsellar left a comment

Apologies, my suggestions included trailing spaces, which have broken the GitHub checks. These suggestions remove some of them - hopefully all...

mo-nikosbaltas and others added 2 commits January 6, 2026 08:53
Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
Co-authored-by: Alistair Sellar <16133375+alistairsellar@users.noreply.github.com>
alistairsellar previously approved these changes Jan 6, 2026
"one for the reference and one for the evaluation run."
)

# Reference dataset: keep existing project/exp/grid but override
Collaborator

I don't think it is keeping the existing project or exp? I think it's overwriting them?

Collaborator

I have read your explanation of why project and exp are being overwritten, but this comment still says that they are not. I still think the comment needs changing.

{
"dataset": ref_model_id,
"project": "ESMVal",
"exp": "amip",
Collaborator

Here, I think we're changing this from "historical". Is it deliberate? If so, the comment should be updated.

Collaborator

Ooh, I should have picked this up in first review. Actually I think that experiment might be wrong for both runs, including the original. For this recipe (radiation budget) the choice of experiment doesn't make a difference, but it will for some recipes, so experiment should be something that the user defines as part of the model run definition. I've just opened an issue to add that: #316.

For this PR, I propose that we accept that the second run is no more wrong than the first, and that having them consistently called "amip" is as good as any choice. I.e. I propose that we keep "exp": "amip" for both.

Collaborator

I still think the comment should reflect what's going on, but I will take note to pay more attention to the unchanged code in a review next time.

Collaborator Author

@mo-nikosbaltas mo-nikosbaltas Jan 7, 2026

I spent some time debugging the failure last week while implementing #287, and I dug out the logs I had kept, so here is the explanation for completeness.
There was a failure because the reference dataset was treated as a CMIP6 “historical” run in the recipe, while CDDS had standardised it as a GCModelDev / ESMVal / amip run.
From the ESMValTool log for the reference dataset (dataset index 0, HadGEM3):
'dataset': 'HadGEM3-GC31-LL',
'project': 'ESMVal',
'mip': 'Amon',
'short_name': 'hfls',
'activity': 'CMIP',
'alias': 'None',
'ensemble': 'r5i1p1f3',
'exp': 'historical',
...
So, after executing update_recipe_file.py:
• ‘project’ has been changed to ESMVal
• ‘ensemble’ is r5i1p1f3 (from REF_VARIANT_LABEL)
• But:
  ◦ ‘exp’ was still historical
  ◦ ‘activity’ was still CMIP
Now looking at where ESMValTool is searching for files (in the logs):
Looked for files matching
/home/users/nikolaos.baltas/cylc-run/CMEW_287/test287c/share/work/GCModelDev/CMIP/MOHC/HadGEM3-GC31-LL/historical/r5i1p1f3/Amon/hfls/gn//hfls_Amon_HadGEM3-GC31-LL_historical_r5i1p1f3_gn.nc
/home/users/nikolaos.baltas/cylc-run/CMEW_287/test287c/share/work/GCModelDev/CMIP/NERC/HadGEM3-GC31-LL/historical/r5i1p1f3/Amon/hfls/gn//hfls_Amon_HadGEM3-GC31-LL_historical_r5i1p1f3_gn.nc
Key bits:
• Path includes GCModelDev/CMIP/.../historical/r5i1p1f3/...
• This is driven by ‘activity: CMIP’ and ‘exp: historical’.
However, the CDDS request (from create_request_file.py) uses:
"experiment_id": "amip",
and is run twice (REF and EVAL) via standardise_model_data. So CDDS is standardising:
• GCModelDev/ESMVal//amip//...
for both runs.
That means:
• CDDS has produced ‘amip’ data
• ESMValTool is still looking for ‘historical’ data for the reference dataset
• Hence: “No input files found for Dataset ... historical ...”
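
The mismatch can be illustrated with a small sketch. The directory template below is hypothetical: it is modelled on the search paths printed in the log above, not copied from ESMValTool's configuration, and the helper name search_dir is an assumption. The institute segment is kept as MOHC purely for illustration.

```python
# Hypothetical directory template, modelled on the search paths printed in
# the ESMValTool log. It is an illustration, not ESMValTool's own config.
DIR_TEMPLATE = (
    "{root}/{activity}/{institute}/{dataset}/{exp}/{ensemble}/"
    "{mip}/{short_name}/{grid}"
)


def search_dir(root, facets):
    """Fill the directory template from a dataset's facets."""
    return DIR_TEMPLATE.format(root=root, **facets)


# Facets of the reference dataset as logged before the fix.
ref_before = {
    "activity": "CMIP",
    "institute": "MOHC",
    "dataset": "HadGEM3-GC31-LL",
    "exp": "historical",
    "ensemble": "r5i1p1f3",
    "mip": "Amon",
    "short_name": "hfls",
    "grid": "gn",
}
# The same dataset with 'activity' and 'exp' overridden.
ref_after = {**ref_before, "activity": "ESMVal", "exp": "amip"}

root = "share/work/GCModelDev"
print(search_dir(root, ref_before))
print(search_dir(root, ref_after))
```

The two printed directories differ in exactly the 'activity' and 'exp' segments, which is why a search driven by the first set of facets cannot find data written under the second.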
The evaluation dataset works fine because we had explicitly set:
'project': 'ESMVal',
'activity': 'ESMVal',
'exp': 'amip',
'ensemble': 'r1i1p1f1',
...
and ESMValTool finds (from logs):
Found input files for Dataset: hfls, Amon, ESMVal, UKESM1-0-LL, ESMVal, amip, r1i1p1f1, gn, v20251230
So, the fix was to make the reference dataset use the GCModelDev/ESMVal “amip” semantics too.
We also need to override ‘exp’ and ‘activity’ for the reference dataset, in the same way as we do for the evaluation dataset.
I updated that block to:
# Reference dataset: treat as a GCModelDev / ESMVal / amip run,
ref_dataset = datasets[0]
ref_dataset.update(
    {
        "dataset": ref_model_id,
        "project": "ESMVal",
        "exp": "amip",
        "activity": "ESMVal",
        "ensemble": ref_variant,
        "start_year": start_year,
        "end_year": end_year,
    }
)
# Evaluation dataset: ESMVal / amip run using MODEL_ID + VARIANT_LABEL
eval_dataset = datasets[1]
eval_dataset.update(
    {
        "dataset": eval_model_id,
        "project": "ESMVal",
        "exp": "amip",
        "activity": "ESMVal",
        "ensemble": eval_variant,
        "start_year": start_year,
        "end_year": end_year,
    }
)
That aligns both datasets with:
• project: ESMVal
• activity: ESMVal
• exp: amip
which matches what CDDS is actually producing from create_request_file.py.
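
For anyone wanting to try the behaviour in isolation, here is a minimal self-contained sketch of the same update. The function name update_datasets, its signature and the error message are assumptions made for illustration; the real update_recipe_file.py may be organised differently.

```python
def update_datasets(datasets, ref_model_id, ref_variant, eval_model_id,
                    eval_variant, start_year, end_year):
    """Overwrite the facets of the reference and evaluation datasets in place.

    Hypothetical wrapper around the update shown above; keys that are not
    listed here (for example 'grid') are left untouched.
    """
    if len(datasets) != 2:
        raise ValueError(
            "Expected exactly two datasets: "
            "one for the reference and one for the evaluation run."
        )
    # Facets shared by both runs, matching what CDDS standardises.
    common = {
        "project": "ESMVal",
        "exp": "amip",
        "activity": "ESMVal",
        "start_year": start_year,
        "end_year": end_year,
    }
    runs = [(ref_model_id, ref_variant), (eval_model_id, eval_variant)]
    for dataset, (model_id, variant) in zip(datasets, runs):
        dataset.update({**common, "dataset": model_id, "ensemble": variant})
    return datasets
```

Because dict.update only touches the listed keys, 'project', 'exp' and 'activity' are overwritten for both datasets while any other existing facets are kept.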

I hope this answers the question of why we are overriding 'project' and 'exp'. Whether this is the correct approach needs further investigation.

Collaborator

I am happy that this comment is resolved from my perspective, but will leave open for @mo-nikosbaltas to close if he and @alistairsellar are satisfied that the "investigate further" aspect has been / is elsewhere addressed.

@NParsonsMO
Collaborator

NParsonsMO commented Jan 6, 2026

@mo-nikosbaltas some of the review comments are just queries, but if I'm correct that we are overwriting the experiment (from "historical" to "amip"), then the comment should reflect this. (The recipe does run successfully with the change, but is the data that it assesses the same data?)

Collaborator

@NParsonsMO NParsonsMO left a comment

The comments on line 80 and line 95 of CMEW/app/configure_for/bin/update_recipe_file.py do not match what is happening.

"one for the reference and one for the evaluation run."
)

# Reference dataset: keep existing project/exp/grid but override

@NParsonsMO
Collaborator

Note: I "mentioned" this issue due to a clipboard paste fail. It is not related to the other issue (#282). Sorry!

Collaborator

@NParsonsMO NParsonsMO left a comment

I am happy that the comments are no longer confusing


Collaborator

@alistairsellar alistairsellar left a comment

Thanks @mo-nikosbaltas and @NParsonsMO, all good for me.

Yes, I think that any further investigation needed is covered by #316.

@mo-nikosbaltas mo-nikosbaltas merged commit 37e5ba2 into main Jan 8, 2026
3 checks passed
@mo-nikosbaltas mo-nikosbaltas deleted the 287-feed-model_id-and-variant_label-to-recipe branch January 8, 2026 13:01
